PRODUCTS OF RANDOM MATRICES: DIMENSION AND GROWTH IN NORM

By Vladislav Kargin

Abstract 

Suppose that $X_1, \ldots, X_n, \ldots$ are independent, identically-distributed, rotationally invariant $N \times N$ matrices. Let $\Pi_n = X_n \cdots X_1$. It is known that $n^{-1}\log\|\Pi_n\|$ converges to a non-random limit. We prove that under certain additional assumptions on matrices $X_i$ the speed of convergence to this limit does not decrease when the size of matrices, $N$, grows.

1. Introduction. Let $X_i$ be a sequence of independent $N \times N$ random matrices and $\Pi_n = X_n \cdots X_1$. In a celebrated paper [I], Furstenberg and Kesten proved that $n^{-1}\log\|\Pi_n\|$ converges provided that $E\log^+(\|X_i\|) < \infty$. Later, Oseledec in [7] proved convergence for other singular values of $\Pi_n$, and Cohen and Newman in [I] studied the behavior of the limit in the situation when $N$ approaches infinity. This paper investigates the question of how the speed of convergence depends on the dimension of matrices $N$.

Consider a dynamical system (a gas, an economy, an ecosystem, etc.). Its evolution can be described by a mapping $\psi_i \to X_i(\psi_i)$, where $\psi_i$ is a vector that describes the state of the system at time $i$. We can often model the mapping as a multiplication by a random matrix $X_i$. Stability and other long-run properties of the system depend on the growth in the norm of the product $\Pi_n = X_n \cdots X_1$, which we can measure by calculating the quantity $n^{-1}\log(\|\Pi_n\|)$.

The sub-multiplicativity property of the norm ($\|X_2X_1\| \le \|X_2\|\,\|X_1\|$) ensures that $n^{-1}\log(\|\Pi_n\|)$ converges to $E\log\|X_1u\|$, where $u$ is an arbitrary unit vector. Intuitively, this means that the starting vector of the system is not important. After some time, all products grow at the same rate independently of the initial state.
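The loss of memory about the initial state can be seen in a small simulation. The sketch below is only an illustration under assumptions that are not in the text: it uses Gaussian matrices with entries $\mathcal{N}(0, \sigma^2/N)$, the two starting vectors $e_1$ and $e_2$, and a hypothetical helper name `growth_rates`. Both vectors are driven by the same matrices, and the two values of $n^{-1}\log\|\Pi_n v\|$ should end up close to each other.

```python
import math
import random

def growth_rates(n, N, sigma=1.0, seed=0):
    """Return n^{-1} log ||Pi_n v|| for v = e_1 and v = e_2, same matrices for both."""
    rng = random.Random(seed)
    scale = sigma / math.sqrt(N)          # entries are N(0, sigma^2 / N)
    vectors = [[1.0 if k == j else 0.0 for k in range(N)] for j in (0, 1)]
    log_norms = [0.0, 0.0]
    for _ in range(n):
        X = [[rng.gauss(0.0, scale) for _ in range(N)] for _ in range(N)]
        for j, v in enumerate(vectors):
            w = [sum(row[k] * v[k] for k in range(N)) for row in X]
            norm = math.sqrt(sum(c * c for c in w))
            log_norms[j] += math.log(norm)           # accumulate the log of one-step growth
            vectors[j] = [c / norm for c in w]       # renormalise to avoid under/overflow
    return log_norms[0] / n, log_norms[1] / n

r1, r2 = growth_rates(n=300, N=5)
print(r1, r2)
```

Renormalising after every step keeps the computation stable; the sum of the logged one-step norms equals $\log\|\Pi_n v\|$ exactly.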

It is of interest to investigate whether this erasure of memory about the initial state occurs more slowly in more complex systems, that is, in systems that are described by matrices of larger size.

Of course, when we compare long-run properties of systems, we should only look at systems that are comparable in the short run, that is, systems that have comparable one-step behavior. Roughly, the difference between the one-step growth of a specially chosen vector and that of a random vector can be measured by the ratio of $\|X_i\|^2$ to $N^{-1}\mathrm{tr}(X_i^*X_i)$, where $N$ is the dimension of the matrix $X_i$. Indeed, $\|X_i\|^2$ is the square of the maximal possible increase in the length of the state vector, and $N^{-1}\mathrm{tr}(X_i^*X_i)$ is the average of the squared singular values of $X_i$, hence it can be considered as a measure of the increase in the length of a random state vector.

Hence, if we want systems to be comparable in the short run, then we 
should restrict this ratio by a constant that does not depend on the di- 
mension of the system. We will call this property uniform boundedness of 
singular values. 

We also want to look at sufficiently symmetric systems, that is, systems 
without preferential directions. We codify this by requiring that matrices Xi 
are rotationally invariant, that is, the distribution of matrix elements does 
not depend on the choice of basis. 

The main result of this paper is that under these assumptions the speed with which the memory of the initial state is erased does not decrease as the dimension of the system grows.

Intuitively, the asymptotic behavior of $n^{-1}\log\|\Pi_n\|$ depends on three factors. First of all, for a fixed vector $v$,

$n^{-1}\log\|\Pi_nv\| = n^{-1}\sum_{i=1}^{n}\log\|X_iv_i\|$

for a certain sequence of vectors $v_i$, and this averaging is likely to concentrate the distribution of $n^{-1}\log\|\Pi_nv\|$. This factor does not depend on the dimension $N$. On the other hand, we are interested in the convergence of the supremum of $n^{-1}\log\|\Pi_nv\|$ over all $v \in S^{N-1}$, and to ensure the convergence of this supremum we have to make sure that the variables $n^{-1}\log\|\Pi_nv\|$ are all close to the limit $E\log\|X_1u\|$ for a sufficiently dense set of vectors $v$. The number of elements in such a set is likely to grow exponentially in $N$, and this might make the convergence of $n^{-1}\log\|\Pi_n\|$ slower for large $N$.

The third factor appears because for every fixed vector $v$, the norm $\|X_iv\|$ becomes concentrated around some particular value as $N \to \infty$. This factor is likely to speed up the convergence of $n^{-1}\log\|\Pi_nv\|$ and therefore of $n^{-1}\log\|\Pi_n\|$.

We will show in this paper that the third factor dominates and the speed of convergence of $n^{-1}\log\|\Pi_n\|$ is not slowed down by the growth in the dimension $N$.

Previously, the speed of convergence in the Furstenberg-Kesten theorem was investigated in [8], [5], and [1]. These papers proved a central limit theorem for $n^{-1/2}\log\|\Pi_n\|$ and studied large deviations of $n^{-1}\log\|\Pi_n\|$ for a large class of random matrices. However, the results in these papers do not provide effectively computable bounds on the rate of convergence in limit theorems, and, as a consequence, do not help us to investigate how the speed of convergence changes as the dimension of matrices grows. One of the contributions of this paper is deriving more explicit bounds on the speed of convergence in limit theorems.

Let us describe the problem in a more formal fashion. Consider independent identically-distributed $N$-by-$N$ matrices $X_i^{(N)}$. We are interested in the behavior of the norm of the product $\Pi_n = X_n^{(N)} \cdots X_1^{(N)}$, and we will make the following assumptions about matrices $X_i^{(N)}$. First of all, we assume that random matrices $X_i^{(N)}$ are rotationally invariant; that is, the distribution of their entries does not depend on the choice of coordinates. Formally, we use the following definition:

Definition 1 A random matrix $X$ is rotationally invariant if for every integer $k \ge 1$, for every collection of vectors $\{v_i, w_i\}$, $i = 1, \ldots, k$, and for every orthogonal matrix $U$, the joint distributions of random vectors $\{\langle w_i, Xv_i\rangle\}_{i=1}^{k}$ and $\{\langle Uw_i, XUv_i\rangle\}_{i=1}^{k}$ are the same.

Assumption A ("rotational invariance") Matrices $X_i^{(N)}$ are rotationally invariant.

We also impose an assumption needed for the validity of the Furstenberg-Kesten theorem.

Assumption B ("Furstenberg-Kesten") For all $N$, $E\log^+\|X_i^{(N)}\|$ exists.

Second, we restrict our study to two important cases. The first one is the case of (real) Gaussian matrices $X_i^{(N)}$, that is, independent random $N$-by-$N$ matrices with independent entries distributed according to the Gaussian distribution with zero mean and variance $\sigma^2/N$, i.e., as $\mathcal{N}(0, \sigma^2/N)$.

The second case is that of independent rotationally invariant $N$-by-$N$ matrices $X_i^{(N)}$ that satisfy the following assumptions. Let $s_k^{(i,N)}$ be the eigenvalues of $X_i^{(N)*}X_i^{(N)}$ (i.e., squared singular values of $X_i^{(N)}$), and let

$s^{(i,N)} = \frac{1}{N}\sum_{k=1}^{N}s_k^{(i,N)}.$

(We will sometimes omit superscripts to lighten the notation.)

Assumption C ("uniformly bounded singular values") With probability 1, $\max_k s_k^{(i,N)} \le b\,s^{(i,N)}$, where the constant $b$ does not depend on $N$.

In other form, Assumption C says that

$\|X_i^{(N)}\|^2 \le \frac{b}{N}\mathrm{tr}\left(X_i^{(N)*}X_i^{(N)}\right)$

with probability 1.

Assumption D ("comparability across $N$") $\mathrm{Var}\left[\log s^{(i,N)}\right]$ exists and is bounded by a constant which does not depend on $N$.



One example of a matrix family that satisfies these assumptions is Hermitian matrices $X_i^{(N)}$ which are generated in the following way. Sample $N$ independent values from a distribution supported on $[\alpha, \beta]$, where $\beta > \alpha > 0$, and construct a diagonal matrix $D^{(N)}$ by putting these values on the main diagonal. Then take a Haar-distributed random orthogonal matrix $U^{(N)}$ and define $X_i^{(N)}$ as $D^{(N)}U^{(N)}$. A sequence of these matrices (with independent $U^{(N)}$) will satisfy all the assumptions.
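The construction can be sketched in a few lines. This is an illustration only, with several assumptions that are not in the text: the diagonal values are drawn from the uniform distribution on $[\alpha, \beta]$, the Haar orthogonal matrix is produced by Gram-Schmidt applied to Gaussian vectors, and the helper names `haar_orthogonal` and `sample_example` are hypothetical. For a product $DU$ with $U$ orthogonal, the squared singular values are the squared diagonal entries of $D$, so the ratio in Assumption C is at most $\beta^2/\alpha^2$.

```python
import math
import random

def haar_orthogonal(N, rng):
    """Orthonormalise i.i.d. Gaussian vectors (modified Gram-Schmidt)."""
    rows = []
    for _ in range(N):
        v = [rng.gauss(0.0, 1.0) for _ in range(N)]
        for q in rows:
            dot = sum(a * b for a, b in zip(v, q))
            v = [a - dot * b for a, b in zip(v, q)]
        norm = math.sqrt(sum(a * a for a in v))
        rows.append([a / norm for a in v])
    return rows

def sample_example(N, alpha, beta, seed=0):
    """Return the diagonal d and the matrix X = D U (uniform diagonal is an assumption)."""
    rng = random.Random(seed)
    d = [rng.uniform(alpha, beta) for _ in range(N)]
    U = haar_orthogonal(N, rng)
    X = [[d[i] * U[i][j] for j in range(N)] for i in range(N)]
    return d, X

d, X = sample_example(N=6, alpha=1.0, beta=2.0)
N = len(d)
# X X^T = D U U^T D = D^2, so the squared singular values are d_k^2.
gram = [[sum(X[i][k] * X[j][k] for k in range(N)) for j in range(N)] for i in range(N)]
s = [x * x for x in d]
ratio = max(s) / (sum(s) / N)   # the quantity bounded by b in Assumption C
print(ratio)
```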

The main result is as follows: 

Theorem 2 Let $X_i^{(N)}$ be independent, identically distributed $N \times N$ matrices, which satisfy Assumptions A and B and which are either Gaussian with independent entries $\mathcal{N}(0, \sigma^2/N)$, or satisfy Assumptions C and D. Let $\Pi_n = X_n^{(N)} \cdots X_1^{(N)}$ and let $v$ be an arbitrary unit vector. Then $n^{-1}\log\|\Pi_n\|$ converges in probability to $E\log\|X_1^{(N)}v\|$ and the convergence is uniform in $N$. That is, for each $\delta > 0$, there exists an $n_0(\delta)$ such that for all $n \ge n_0$ and all $N \ge 1$,

(1)  $\Pr\left\{\left|n^{-1}\log\|\Pi_n\| - E\log\|X_1^{(N)}v\|\right| > \delta\right\} < \delta.$

The assumptions of the theorem are sufficient but not necessary. The assumption that $s_k \le bs$ is used in the proof of Proposition 3 below, where it is used to estimate the probability of large deviations of $\log\|X_i^{(N)}v\|$ and to show that the rate in the corresponding exponential inequality is proportional to $N$. It is likely that this assumption can be somewhat relaxed by requiring instead an exponential tail bound on $\Pr\{s_k/s > b + u\}$.

One particular implication of the assumption $s_k \le bs$ is that the bound on singular values does not depend on the dimension of the matrix. In order to understand this assumption better, consider the following example. Let

$X_i = \sqrt{N}\,|y_i\rangle\langle x_i|,$

where $\langle x_i|$ is a Haar-distributed row $N$-dimensional vector, and $|y_i\rangle$ is a Haar-distributed column $N$-dimensional vector. (Vectors $\langle x_i|$ and $|y_i\rangle$ are assumed to be independent.) Then the squared singular values of $X_i$ are all zero except one, which equals $N$. Hence $s^{(i,N)} = 1$ and $\log s^{(i,N)} = 0$. We can conclude that Assumptions A, B, and D are satisfied, and Assumption C is not satisfied.

Next, consider $\|X_i^{(N)}v\|^2$, where $v$ is an arbitrary unit vector. It is easy to see that this random variable is distributed as

$N(u_1)^2,$

where $u_1$ is the first coordinate of a Haar-distributed vector $u$. In other words, it is distributed as

$\frac{Y_1^2}{(Y_1^2 + \ldots + Y_N^2)/N},$

where $Y_i$ are independent standard Gaussian variables. Using these facts, it is possible to check that

$\lim_{N\to\infty}E\log\|X_iv\|^2 = E\log(Y_1)^2 \in (-\infty, 0).$

Next, let us compute $n^{-1}\log\|\Pi_n\|^2$. Note that

$\|\Pi_n\|^2 = \|\Pi_n^*\Pi_n\|$

and

$\Pi_n^*\Pi_n = N^n\,|x_1\rangle\langle x_n|y_{n-1}\rangle^2 \cdots \langle x_2|y_1\rangle^2\langle x_1|.$

Hence,

$n^{-1}\log\|\Pi_n\|^2 = \frac{\log N}{n} + \frac{1}{n}\sum_{i=1}^{n-1}\log\xi_i,$

where $\xi_i$ are independent and distributed as $N(u_1)^2$ above. Hence, $\xi_i$ converges in distribution to $Y_1^2$ as $N \to \infty$. It is clear that

$\frac{1}{n}\sum_{i=1}^{n-1}\log\xi_i \to E\log\|X_iv\|^2$

in probability as $n \to \infty$. Therefore, for large $N$,

$\frac{1}{n}\log\|\Pi_n\|^2 - E\log\|X_iv\|^2 \approx \frac{\log N}{n}.$

This bias term cannot be made small uniformly in $N$ by an increase in $n$. This means that the claim of Theorem 2 fails in this case.
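The identity behind this example can be checked numerically. The sketch below is illustrative and rests on assumptions not taken from the text: it builds $X_i = \sqrt{N}\,|y_i\rangle\langle x_i|$ from normalised Gaussian vectors, uses the hypothetical helper names `unit_vector` and `rank_one_rate`, and relies on the fact that for a rank-one matrix the operator norm equals the Frobenius norm. It compares the closed formula for $n^{-1}\log\|\Pi_n\|^2$, which contains the $(\log N)/n$ term, with a brute-force matrix product.

```python
import math
import random

def unit_vector(N, rng):
    """Uniform point on the unit sphere, obtained by normalising a Gaussian vector."""
    g = [rng.gauss(0.0, 1.0) for _ in range(N)]
    norm = math.sqrt(sum(c * c for c in g))
    return [c / norm for c in g]

def rank_one_rate(n, N, seed=0):
    """Return n^{-1} log ||Pi_n||^2 from the closed formula and from brute force."""
    rng = random.Random(seed)
    xs = [unit_vector(N, rng) for _ in range(n)]
    ys = [unit_vector(N, rng) for _ in range(n)]
    # Closed form: (log N)/n + (1/n) * sum_{i=1}^{n-1} log(N <x_{i+1}, y_i>^2).
    total = math.log(N)
    for i in range(n - 1):
        inner = sum(a * b for a, b in zip(xs[i + 1], ys[i]))
        total += math.log(N * inner * inner)
    formula = total / n
    # Brute force: Pi_n = X_n ... X_1 with X_i = sqrt(N) |y_i><x_i|.
    P = [[1.0 if r == c else 0.0 for c in range(N)] for r in range(N)]
    for x, y in zip(xs, ys):
        X = [[math.sqrt(N) * y[r] * x[c] for c in range(N)] for r in range(N)]
        P = [[sum(X[r][k] * P[k][c] for k in range(N)) for c in range(N)] for r in range(N)]
    frob_sq = sum(v * v for row in P for v in row)   # equals ||Pi_n||^2 for rank one
    return formula, math.log(frob_sq) / n

formula, brute = rank_one_rate(n=6, N=4)
print(formula, brute)
```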

Later, in Section 3, we will prove a necessary condition for the uniform convergence by using the basic idea of this example.

In order to understand the role of the rotational invariance assumption, 
consider the following example. 

Let $X_i$ be independent, identically distributed, diagonal matrices. The diagonal elements of a matrix $X_i$ are independent Bernoulli variables that take values $a$ and $b$. That is, a diagonal element takes the value $b > 0$ with probability $p$ and the value $a > 0$ with probability $q = 1 - p$. Assume that $b > a$.

It is easy to see that the norm of $\Pi_n = X_1 \cdots X_n$ is given by the following expression:

$\|\Pi_n\| = \max_{1 \le j \le N} a^{\alpha_j}b^{\beta_j},$

where $\alpha_j + \beta_j = n$, and $\beta_j$ are independent random variables with the binomial distribution $B(p, n)$.

Taking the logarithm and dividing by $n$, we get:

$n^{-1}\log\|\Pi_n\| = \log a + \max_j\hat\beta_j\log(b/a),$

where $\hat\beta_j = \beta_j/n$. Note that as $n$ grows, each $\hat\beta_j$ approaches the Gaussian distribution $\mathcal{N}(p, pq/n)$.

If $N$ is fixed, then $\lim_{n\to\infty} n^{-1}\log\|\Pi_n\| = \log a + p\log(b/a)$. However, if $N$ grows simultaneously with $n$, then the limit of $n^{-1}\log\|\Pi_n\|$ may be non-existent, or may depend on the speed of growth in $N$ relative to the speed of growth in $n$. Hence, the conclusion of Theorem 2 is invalid in this case.
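A short simulation makes the dependence on $N$ visible. It is a sketch under illustrative assumptions: the values $a = 1$, $b = 2$, $p = 1/2$ and the helper name `diagonal_rate` are not from the text. The function evaluates $\log a + \max_j(\beta_j/n)\log(b/a)$ with independent binomial counts $\beta_j$; for $N = 1$ and large $n$ the result should be near $\log a + p\log(b/a)$, while for $N$ much larger than $2^n$ it should equal $\log b$.

```python
import math
import random

def diagonal_rate(n, N, a, b, p, seed=0):
    """n^{-1} log ||Pi_n|| for i.i.d. diagonal matrices with entries b (prob. p) or a."""
    rng = random.Random(seed)
    best = 0
    for _ in range(N):
        # beta_j: number of times the j-th diagonal entry equals b among n factors
        beta = sum(1 for _ in range(n) if rng.random() < p)
        best = max(best, beta)
    return math.log(a) + (best / n) * math.log(b / a)

fixed_dim = diagonal_rate(n=20000, N=1, a=1.0, b=2.0, p=0.5)
large_dim = diagonal_rate(n=5, N=2000, a=1.0, b=2.0, p=0.5)
print(fixed_dim, large_dim)
```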

It is an interesting problem whether the assumption of rotational invariance can be relaxed so that the result in Theorem 2 holds for a larger class of matrices, for example, for matrices with i.i.d. non-Gaussian entries (i.e., Wigner matrices). However, this problem appears to be hard since at this moment very little is known about effective bounds on the rate of convergence in the Furstenberg-Kesten theorem.

Let me now explain two results which will be used as tools in the proof of Theorem 2. The proofs of these results will be given in later sections.

Our main tool is the following proposition.

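Proposition 3 (i) Suppose that all $X_i$ are Gaussian with independent entries $\mathcal{N}(0, \sigma^2/N)$. Then for all sufficiently small $t$, all $N \ge N_1(t)$ and all $n \ge 1$,

$\Pr\left\{\left|\frac{1}{n}\log\|\Pi_nv\| - \log\sigma\right| > t\right\} < 2\exp\left[-\frac{1}{8}Nnt^2\right].$

(ii) Suppose that i.i.d. $N$-by-$N$ matrices $X_i$ are rotationally invariant and satisfy Assumption C with constant $b$. Let

$s^{(i,N)} = \frac{1}{N}\sum_{k=1}^{N}s_k^{(i,N)} = \frac{1}{N}\mathrm{tr}\left(X_i^*X_i\right).$

Then for all $t \in (0, 1/4)$, all $N \ge N_1(t)$ and all $n \ge 1$,

$\Pr\left\{\left|\frac{1}{n}\log\|\Pi_nv\|^2 - \frac{1}{n}\sum_{i=1}^{n}\log s^{(i,N)}\right| > t\right\} < 2\exp\left[-\frac{1}{32b^2}Nnt^2\right].$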

In its essence, Proposition 3 is a large deviation result which quantifies the speed of convergence of $n^{-1}\log\|\Pi_nv\|$ for a fixed vector $v$. Its main point is that the rate in this large deviation estimate is proportional to the dimension $N$. The proof of this proposition will be given in Section 2.

The other tool is as follows. Let a set of points on the unit sphere in $\mathbb{R}^N$ be called an $\varepsilon$-net if the sphere is covered by spherical caps with centers at these points and angular radius $\varepsilon$.

Proposition 4 Let $A$ be an arbitrary $N$-by-$N$ matrix. Suppose that the endpoints of vectors $v_i$ form an $\varepsilon$-net of the unit sphere in $\mathbb{R}^N$. Then, for all sufficiently small $\varepsilon$,

$\log\|A\| \le \max_i\log\|Av_i\| + 2\varepsilon.$

This proposition allows us to control the matrix norm $\|\Pi_n\|$ by the norms of vectors $\|\Pi_nv_i\|$, where $v_i$ runs through a finite set of values.

Proof: Let $v_i$ be a vector in the net which is closest to a unit vector $v$. Then

$\|Av\| \le \|Av_i\| + \|A(v - v_i)\| \le \|Av_i\| + \varepsilon\|A\|.$

Taking the supremum over $v$, we obtain that

$(1 - \varepsilon)\|A\| \le \max_i\|Av_i\|.$

Hence,

$\log\|A\| \le \max_i\log\|Av_i\| - \log(1 - \varepsilon),$

and the claim of the proposition follows, since $-\log(1 - \varepsilon) \le 2\varepsilon$ for $\varepsilon \le 1/2$. QED.
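The net argument can be checked directly in dimension two, where an $\varepsilon$-net of the unit circle is just a set of equally spaced angles. The sketch below is illustrative: the matrix `A`, the value of `eps`, and the helper names `operator_norm_2x2` and `net_bound` are assumptions introduced here. It uses the closed form for the largest singular value of a $2 \times 2$ matrix and compares $\log\|A\|$ with the maximum of $\log\|Av_i\|$ over the net.

```python
import math

def operator_norm_2x2(A):
    """Largest singular value of a 2x2 matrix via the eigenvalues of A^T A."""
    (p, q), (r, s) = A
    trace = p * p + q * q + r * r + s * s      # trace of A^T A
    det = (p * s - q * r) ** 2                 # determinant of A^T A
    disc = math.sqrt(max(trace * trace - 4.0 * det, 0.0))
    return math.sqrt((trace + disc) / 2.0)

def net_bound(A, eps):
    """max_i log ||A v_i|| over equally spaced unit vectors with angular spacing <= 2*eps."""
    count = math.ceil(math.pi / eps)           # then every angle is within eps of a net point
    best = -math.inf
    for k in range(count):
        theta = 2.0 * math.pi * k / count
        v = (math.cos(theta), math.sin(theta))
        Av = (A[0][0] * v[0] + A[0][1] * v[1], A[1][0] * v[0] + A[1][1] * v[1])
        best = max(best, math.log(math.hypot(Av[0], Av[1])))
    return best

A = ((2.0, 1.0), (0.0, 0.5))
eps = 0.1
log_norm = math.log(operator_norm_2x2(A))
log_net = net_bound(A, eps)
print(log_norm, log_net, log_net + 2.0 * eps)
```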

This proposition is useful in conjunction with the following result about the size of sphere coverings. By Lemma 2.6 on page 7 of [6], for $\varepsilon$ smaller than a certain constant, there exists an $\varepsilon$-net with cardinality $M \le \exp(N\log(3/\varepsilon))$.

Now let us prove Theorem 2 by using Propositions 3 and 4.

Proof of Theorem 2: We focus on the case when Assumptions C and D hold. The proof for the case of Gaussian matrices goes along a similar route and it is simpler.

First of all, note that it is enough to prove that (1) holds for all sufficiently large $N$, i.e., for all $N \ge N_0(\delta)$. Indeed, for each $N < N_0$ we can apply results in [3] and find that inequality (1) holds if $n \ge n(\delta, N)$. Hence, inequality (1) holds for all $N < N_0$, provided that

$n \ge n_0(\delta) = \max_{N < N_0}\{n(\delta, N)\}.$

We will choose an appropriate $N_0(\delta)$ later.

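We are going to prove that for all sufficiently large $N$ and $n$ (i.e., $N \ge N_2(\delta)$ and all $n \ge n_2(\delta)$), it is true that

(2)  $\Pr\left\{\left|\frac{1}{n}\log\|\Pi_n\|^2 - \frac{1}{n}\sum_{i=1}^{n}\log s^{(i,N)}\right| > \frac{\delta}{10}\right\} < \frac{\delta}{10}.$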
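Let vectors $v_j$, $j = 1, \ldots, M$, form a $(\delta/100)$-net on the unit sphere. Then, by using Propositions 3 and 4, the union bound, and the estimate on the number of elements in the net, we obtain:

$\Pr\left\{\left|\frac{1}{n}\log\|\Pi_n\|^2 - \frac{1}{n}\sum_{i=1}^{n}\log s^{(i,N)}\right| > \frac{\delta}{10}\right\}$
$\le \Pr\left\{\max_j\left|\frac{1}{n}\log\|\Pi_nv_j\|^2 - \frac{1}{n}\sum_{i=1}^{n}\log s^{(i,N)}\right| > \frac{\delta}{100}\right\}$
$\le 2\exp\left\{\left(\log\left(\frac{300}{\delta}\right) - cn\left(\frac{\delta}{100}\right)^2\right)N\right\},$

where $c$ is a certain constant. Clearly, we can choose $n_2(\delta)$ in such a way that for all $n \ge n_2(\delta)$, it is true that

$\log\left(\frac{300}{\delta}\right) - cn\left(\frac{\delta}{100}\right)^2 < a < 0$

for some $a$, and then choose $N_2(\delta)$ such that for all $N \ge N_2(\delta)$ it is true that

$2\exp\{aN\} < \frac{\delta}{10}.$

This choice of $n_2(\delta)$ and $N_2(\delta)$ is sufficient to ensure that (2) holds.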
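The choice of $n_2(\delta)$ amounts to making the exponent per unit of dimension negative, so that the union bound over the net decays in $N$. The sketch below computes the smallest such $n$. It is only an illustration: the value used for the constant $c$ is a hypothetical placeholder (the text says only that $c$ is a certain constant), and the helper names are assumptions.

```python
import math

def exponent_per_dimension(n, delta, c):
    """log(300/delta) - c*n*(delta/100)^2: net entropy minus large-deviation rate."""
    return math.log(300.0 / delta) - c * n * (delta / 100.0) ** 2

def smallest_n(delta, c):
    """Smallest integer n for which the exponent per dimension is negative."""
    threshold = math.log(300.0 / delta) / (c * (delta / 100.0) ** 2)
    return math.floor(threshold) + 1

delta = 0.5
c = 1.0 / 32.0      # hypothetical value, chosen only for illustration
n2 = smallest_n(delta, c)
print(n2, exponent_per_dimension(n2, delta, c))
```

The printed threshold grows like $\delta^{-2}\log(1/\delta)$, and it does not involve $N$, which is the point of the argument.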

Next, let $d_N = E\log s^{(i,N)}$. Since the variance of $\log s^{(i,N)}$ is bounded above by a finite constant which does not depend on $N$ (Assumption D), we can find $n_3(\delta)$ such that for all $n \ge n_3(\delta)$, it is true that

(3)  $\Pr\left\{\left|\frac{1}{n}\sum_{i=1}^{n}\log s^{(i,N)} - d_N\right| > \frac{\delta}{100}\right\} < \frac{\delta}{100}$

for all $N$.

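It follows that for all $n \ge \max\{n_2(\delta), n_3(\delta)\}$ and $N \ge N_2(\delta)$, it is true that

(4)  $\Pr\left\{\left|\frac{1}{n}\log\|\Pi_n\|^2 - d_N\right| > \frac{\delta}{5}\right\} < \frac{\delta}{5}.$

Note that by the Furstenberg-Kesten theorem,

(5)  $\Pr\left\{\left|\frac{1}{n}\log\|\Pi_n\|^2 - E\log\|X_1^{(N)}v\|^2\right| > \frac{\delta}{5}\right\} < \frac{\delta}{5}$

for all $n \ge n(\delta, N)$. This implies that for all $N \ge N_2(\delta)$, there exists such $n$ that both inequalities (4) and (5) hold. This implies that for all such $N$, and for all $\delta < 1$, the following inequality holds:

(6)  $\left|d_N - E\log\|X_1^{(N)}v\|^2\right| \le \frac{2\delta}{5}.$

Otherwise, the union of the events in (4) and (5) would cover the whole probability space, and hence the sum of the probabilities in (4) and (5), $2\delta/5$, would have to be greater than 1. This contradicts the assumption that $\delta < 1$. Inequalities (2), (3) and (6) imply that

$\Pr\left\{\left|\frac{1}{n}\log\|\Pi_n\|^2 - E\log\|X_1v\|^2\right| > \delta\right\} < \delta$

for all $n \ge n_0(\delta)$ and $N \ge N_0(\delta)$, where $n_0$ and $N_0$ are sufficiently large functions of $\delta$. QED.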

It remains to complete the proof by proving Proposition 3. We will do this in the next section.

The rest of the paper consists of Section 2, which is devoted to the proof of Proposition 3, Section 3, which gives a necessary condition for uniform convergence in the Furstenberg-Kesten theorem, and Section 4, which concludes.

2. A large deviation bound for the dilation of a fixed vector. Everywhere in this section, we assume that random matrices $X_i$ are independent, identically distributed, and rotationally invariant, and that $\Pi_i = X_iX_{i-1} \cdots X_1$. Let us consider the following random variables:

$y_i = \log\left(\|\Pi_iv\|^2/\|\Pi_{i-1}v\|^2\right).$

It is known (e.g., [J]) that the random variables $y_i$ are independent and identically distributed. Their distribution coincides with the distribution of $\log\|X_1v\|^2$, where $v$ is an arbitrary unit vector.

2.1. Gaussian matrices. In this section we consider an important case when each matrix $X_i$ has independent Gaussian entries distributed according to $\mathcal{N}(0, \sigma^2/N)$. In this case, $\log\|X_iv\|^2$ is distributed in the same way as the random variable

$\log\left(\frac{\sigma^2}{N}\sum_{k=1}^{N}Y_k^2\right),$

where $Y_k$ are independent standard Gaussian variables. In order to prove Proposition 3 for this case, it is enough to show that the following result holds.

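Proposition 5 Let $y_i$ be independent copies of the variable

$\log\left(\frac{1}{N}\sum_{k=1}^{N}Y_k^2\right).$

If $0 < t < 1$, then there exists a function $N_0(t)$ such that for all $N \ge N_0(t)$ and all $n$, the following inequality holds:

$\Pr\left\{\left|\frac{1}{n}\sum_{i=1}^{n}y_i\right| > t\right\} \le 2e^{-Nnt^2/8}.$

Proof: First of all, let us compute

$E\left(\frac{1}{N}\sum_{i=1}^{N}Y_i^2\right)^z,$

where $z$ is a real number. By explicit calculation,

$E\left(\frac{1}{N}\sum_{i=1}^{N}Y_i^2\right)^z = \left(\frac{2}{N}\right)^z\frac{\Gamma\left(\frac{N}{2} + z\right)}{\Gamma\left(\frac{N}{2}\right)},$

where $\Gamma(z)$ is the Gamma function. This formula is valid for $z > -N/2$.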

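Let $z = \alpha N$, where $\alpha > -1/2$. Then using the Stirling formula for large $N$, we can write:

(7)  $E\left(\frac{1}{N}\sum_{i=1}^{N}Y_i^2\right)^z \sim \frac{1}{\sqrt{1 + 2\alpha}}\exp\left\{\left[\left(\frac{1}{2} + \alpha\right)\log(1 + 2\alpha) - \alpha\right]N\right\}.$

Note that for all $\alpha > 0$, $\left(\frac{1}{2} + \alpha\right)\log(1 + 2\alpha) - \alpha \le \alpha^2$, and for all $\alpha > -1/2$, $\left(\frac{1}{2} + \alpha\right)\log(1 + 2\alpha) - \alpha \le 2\alpha^2$, with equalities only for $\alpha = 0$.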
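The moment formula and the two elementary inequalities can be checked numerically. The sketch below is illustrative: the function names are assumptions, and `log_stirling` implements a leading-order Stirling approximation, $N[(\tfrac12 + \alpha)\log(1 + 2\alpha) - \alpha] - \tfrac12\log(1 + 2\alpha)$, which I derived for this check rather than copied from the text. Everything is done on the log scale to avoid overflow.

```python
import math

def log_moment(N, z):
    """log E[((Y_1^2 + ... + Y_N^2)/N)^z] from the Gamma-function formula (z > -N/2)."""
    return z * math.log(2.0 / N) + math.lgamma(N / 2.0 + z) - math.lgamma(N / 2.0)

def rate(alpha):
    """The exponent (1/2 + alpha) log(1 + 2 alpha) - alpha per unit of dimension."""
    return (0.5 + alpha) * math.log(1.0 + 2.0 * alpha) - alpha

def log_stirling(N, alpha):
    """Leading-order Stirling approximation of log_moment(N, alpha * N)."""
    return N * rate(alpha) - 0.5 * math.log(1.0 + 2.0 * alpha)

# Exact checks: the first moment is 1 and the second moment is 1 + 2/N.
first = math.exp(log_moment(10, 1.0))
second = math.exp(log_moment(10, 2.0))
# Asymptotic check for N = 2000 and alpha = 0.25 (so z = 500).
gap = abs(log_moment(2000, 500.0) - log_stirling(2000, 0.25))
# The two inequalities used in the proof, on a grid of alpha in (-1/2, 2].
alphas = [k / 100.0 for k in range(-49, 201)]
upper_ok = all(rate(a) <= 2.0 * a * a + 1e-12 for a in alphas)
positive_ok = all(rate(a) <= a * a + 1e-12 for a in alphas if a > 0)
print(first, second, gap, upper_ok, positive_ok)
```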

If $t > 0$, we set $\alpha = t/2$ and $z = (t/2)N$, and use the fact that for all sufficiently large $N$, the asymptotic term in (7) dominates all other terms. Hence, we obtain the estimate:

$e^{-tz}Ee^{zy} \le \exp\left(-t^2N/4\right).$

If $t \in (-1, 0)$ then we can take $\alpha = t/4$ and $z = (t/4)N$, and we obtain:

$e^{-tz}Ee^{zy} \le \exp\left(-t^2N/8\right).$

By standard arguments we can translate these inequalities into statements about probabilities of large deviations. If $0 < t < 1$, then

$\Pr\left\{\left|\frac{1}{n}\sum_{i=1}^{n}y_i\right| > t\right\} \le 2e^{-Nnt^2/8}.$

QED.



2.2. Matrices with uniformly bounded singular values. In this section we are going to prove the second part of Proposition 3. Since $X_i$ are i.i.d. and rotationally invariant, the distribution of $y_i = \log\left(\|\Pi_iv\|^2/\|\Pi_{i-1}v\|^2\right)$ coincides with the distribution of $\log\|X_1v\|^2$ and equals the distribution of the random variable $y = \log\sum_{k=1}^{N}s_ku_k^2$. Here $u_k$ are components of the random vector $u$, which is uniformly distributed on the unit sphere and which is independent of $s_k$.

Let us start with considering large deviations of $x = \sum_{k=1}^{N}s_ku_k^2$. Let $s = \frac{1}{N}\sum_{k=1}^{N}s_k$.

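Proposition 6 Suppose that with probability 1, $|s_k| \le B$ for all $k$. Then for all $t > 0$,

(8)  $\max\left\{\Pr\left\{\sum_{k=1}^{N}s_ku_k^2 < s - t\right\}, \Pr\left\{\sum_{k=1}^{N}s_ku_k^2 > s + t\right\}\right\} \le \exp\left[-\frac{Nt^2}{4B(B + t)}\right].$

Proof: Let $x$ denote $\sum_{k=1}^{N}s_ku_k^2$ and let us estimate $\Pr\{x > s^{(N)} + t\}$. We will estimate the conditional probability $\Pr\{x > s^{(N)} + t \mid s_1, \ldots, s_N\}$, which we denote as $\Pr\{x > s + t\}$ for simplicity. Note that

$\Pr\{x > s + t\} \le e^{-z(s+t)}Ee^{zx} = e^{-z(s+t)}\left(1 + M_1z + \frac{1}{2!}M_2z^2 + \ldots\right),$

where $z > 0$ and $M_p = Ex^p$.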



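Let us use von Neumann's formulas from [9] (pages 373-375) for the uncentered moments of the random variable $x$. Namely, let

$a_l = \frac{1}{2l}\sum_{i=1}^{N}(s_i)^l,$

and let

$1 + \beta_1z + \beta_2z^2 + \beta_3z^3 + \ldots = e^{a_1z + a_2z^2 + a_3z^3 + \ldots}.$

Then, von Neumann's result is that

$M_k = \frac{2^kk!}{N(N + 2) \cdots (N + 2k - 2)}\beta_k.$

Using this result, we write:

$1 + M_1z + \frac{1}{2!}M_2z^2 + \ldots \le \exp\left\{a_1\left(\frac{2z}{N}\right) + a_2\left(\frac{2z}{N}\right)^2 + \ldots\right\}.$

Next, note that $2a_1/N = s$, and that $a_l \le (N/2)B^l$. This implies that

$e^{-z(s+t)}Ee^{zx} \le e^{-zt}\exp\left\{\frac{N}{2}\left[\left(\frac{2Bz}{N}\right)^2 + \left(\frac{2Bz}{N}\right)^3 + \ldots\right]\right\} = e^{-zt}\exp\left\{\frac{2B^2z^2}{N - 2Bz}\right\}.$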



Let 



Then 



2B {B + 1) 



N - 2Bzo 



Zot 



Altogether, we get:

$\Pr\{x > s + t\} \le \exp\left[-\frac{Nt^2}{4B(B + t)}\right].$

The proof of the inequality for $\Pr\{x < s - t\}$ is similar. QED.
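A Monte Carlo experiment gives a feeling for how the stated tail bound compares with simulated frequencies. The sketch below is an illustration under assumptions that are not in the text: the weights $s_k$ alternate between 0 and 1 (so $B = 1$), the uniform vector on the sphere is produced by normalising a Gaussian vector, and the helper names are hypothetical. It is a sanity check for one configuration, not a verification of the proposition.

```python
import math
import random

def tail_frequency(s_values, t, trials=20000, seed=0):
    """Empirical frequency of the event sum_k s_k u_k^2 > mean(s) + t, u uniform on the sphere."""
    rng = random.Random(seed)
    N = len(s_values)
    s_bar = sum(s_values) / N
    hits = 0
    for _ in range(trials):
        g = [rng.gauss(0.0, 1.0) for _ in range(N)]
        r2 = sum(c * c for c in g)                       # squared length, used to normalise
        x = sum(s * c * c for s, c in zip(s_values, g)) / r2
        if x > s_bar + t:
            hits += 1
    return hits / trials

def tail_bound(N, B, t):
    """Right-hand side exp(-N t^2 / (4 B (B + t))) of the stated inequality."""
    return math.exp(-N * t * t / (4.0 * B * (B + t)))

s_values = [0.0, 1.0] * 5          # N = 10 and B = 1, chosen only for illustration
freq = tail_frequency(s_values, t=0.3)
bound = tail_bound(len(s_values), 1.0, 0.3)
print(freq, bound)
```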

Corollary 7 Suppose that with probability 1, $s_k \le bs$ for all $k$. Then for all $t > 0$,

(9)  $\max\left\{\Pr\left\{\sum_{k=1}^{N}s_ku_k^2 < s(1 - t)\right\}, \Pr\left\{\sum_{k=1}^{N}s_ku_k^2 > s(1 + t)\right\}\right\} \le \exp\left[-\frac{Nt^2}{4b(b + t)}\right].$

Corollary 8 Let $0 < s_k \le bs$ for each $k$, and $t \in (0, 1/2)$. Then,

(10)  $\Pr\left\{\log\sum_{k=1}^{N}s_ku_k^2 > \log s + t\right\} \le \exp\left[-\frac{Nt^2}{4b(b + t)}\right],$

(11)  $\Pr\left\{\log\sum_{k=1}^{N}s_ku_k^2 < \log s - (2\log 2)t\right\} \le \exp\left[-\frac{Nt^2}{4b(b + t)}\right],$

(12)  $\Pr\left\{\left|\log\sum_{k=1}^{N}s_ku_k^2 - \log s\right| > t\right\} \le 2\exp\left[-\frac{Nt^2}{4c(c + t)}\right],$

where $c = (2\log 2)b$.

Proof: Let $x$ denote $\sum_{k=1}^{N}s_ku_k^2$. Then

$\Pr\{x > s + t\} = \Pr\left\{\log x > \log s + \log\left(1 + \frac{t}{s}\right)\right\} \ge \Pr\left\{\log x > \log s + \frac{t}{s}\right\}.$

This and (8) prove the first inequality. The second inequality is proved similarly, and the third one is a consequence of the first two inequalities. QED.

Lemma 9 Suppose that $X$ is a random variable such that

$\Pr\{|X| > t\} \le 2\exp\left[-\frac{Nt^2}{4c(c + t)}\right],$

where $c > 0$. Let $|z| \le N/(16c)$. Then

$Ee^{zX} \le \sqrt{\frac{32\pi c^2z^2}{N}}\exp\left(\frac{2c^2z^2}{N}\right) + 3e^{|z|/\sqrt{N}} + 2\exp\left(-\frac{N}{16}\right).$

Proof: Consider the case when $z > 0$. First, let us estimate $\int e^{zt}\mu(dt)$, where $\mu$ is the distribution measure of $X$. Let $F(t) =: \Pr\{X > t\}$. Then, by integrating by parts and using the inequalities

$F(t) \le 2\exp\left[-\frac{Nt^2}{4c(c + t)}\right]$

and $N \ge 1$, we get




In order to estimate the integral in the last line, we divide it into two pieces, 
/i/x/jv and . Then, 

Next, for the second piece, we have: 




where we used the assumption that $z \le N/(16c)$.

Hence, combining the previous inequalities and using the assumption that $z \le N/(16c)$ again, we get:



/■oo 



In addition, 



l/[4c(c+l)]g./v^^2zi 



/87rc2 



exp 



2c2 



+2exp(--). 



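Combining all the parts, we get:

$\int e^{zt}\mu(dt) \le \sqrt{\frac{32\pi c^2z^2}{N}}\exp\left(\frac{2c^2z^2}{N}\right) + \left(1 + 2e^{-1/[4c(c+1)]}\right)e^{z/\sqrt{N}} + 2\exp\left(-\frac{N}{16}\right),$

from which the claim of the lemma follows for $z > 0$. The case when $z < 0$ is similar. QED.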

Corollary 10 Let $X = \log\left(\sum_{k=1}^{N}s_ku_k^2\right) - \log(s)$ and let $|z| \le N/(16c)$, where $c = (2\log 2)b$. Then

$Ee^{zX} \le \sqrt{32\pi}\,\frac{c|z|}{\sqrt{N}}\exp\left(\frac{2c^2z^2}{N}\right) + 3e^{|z|/\sqrt{N}} + 2\exp\left(-\frac{N}{16}\right).$

Proof: This follows directly from Lemma 9 and inequality (12). QED.
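Proof of the second part of Proposition 3: Note that

$\log\|\Pi_nv\|^2 = \sum_{i=1}^{n}\log\left[\sum_{k=1}^{N}s_k^{(i,N)}\left(u_k^{(i)}\right)^2\right],$

where $u_k^{(i)}$ are components of independent Haar-distributed $N$-vectors $u^{(i)}$. Let

$Y_i = \log\left(\sum_{k=1}^{N}s_k^{(i,N)}\left(u_k^{(i)}\right)^2\right) - \log\left(s^{(i,N)}\right).$

We aim to estimate

$\Pr\left\{\left|\sum_{i=1}^{n}Y_i\right| > nt\right\}.$

As usual,

$\Pr\left\{\sum_{i=1}^{n}Y_i > nt\right\} \le e^{-nzt}\left(Ee^{zY_i}\right)^n,$

where $z > 0$.

Note that by Assumption C, $s_k^{(i,N)} \le b\,s^{(i,N)}$; hence, our previous lemmas are applicable.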

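We set $z = tN/(4c^2)$ and assume that $N \ge 4/t^2$. (Note that the assumption that $t \in (0, 1/4]$ implies that $z \le N/(16c)$.) Then, by the previous lemma, we have:

$Ee^{zY_i} \le \sqrt{2\pi}\,\frac{t\sqrt{N}}{c}\exp\left(\frac{t^2N}{8c^2}\right) + 3e^{t\sqrt{N}/(4c^2)} + 2\exp\left(-\frac{N}{16}\right).$

Since $N \ge 4/t^2$, the first term dominates the other two terms, and we can write:

$Ee^{zY_i} \le \left(\sqrt{2\pi}\,\frac{t\sqrt{N}}{c} + 5\right)\exp\left[\frac{t^2N}{8c^2}\right].$

Hence,

$e^{-nzt}\left(Ee^{zY_i}\right)^n \le \exp\left\{-n\left[-\log\left(\sqrt{2\pi}\,(t/c)\sqrt{N} + 5\right) + \left(t^2/8c^2\right)N\right]\right\}.$

Clearly we can find an $N_0(t)$ such that for all $N \ge N_0(t)$,

$e^{-nzt}\left(Ee^{zY_i}\right)^n \le \exp\left\{-n\left(t^2/16c^2\right)N\right\}.$

Hence, for all $N \ge N_0(t)$,

(13)  $\Pr\left\{\frac{1}{n}\sum_{i=1}^{n}\left[y_i^{(N)} - \log\left(s^{(i,N)}\right)\right] > t\right\} \le \exp\left\{-n\left(t^2/16c^2\right)N\right\}.$

The case of the inequality

(14)  $\Pr\left\{\frac{1}{n}\sum_{i=1}^{n}\left[y_i^{(N)} - \log\left(s^{(i,N)}\right)\right] < -t\right\} \le \exp\left\{-n\left(t^2/16c^2\right)N\right\}$

is similar. Finally, note that $16c^2 < 32b^2$. QED.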

3. Necessary condition. Let us introduce the following assumption.

Assumption D' $E\left[\log\|X_i^{(N)}v\|\right]^2$ exists and is bounded by a constant that does not depend on $N$.

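Theorem 11 Let Assumptions A, B, and D' hold. Suppose that for every $\delta > 0$ there exists such an $n_0(\delta)$ that

(15)  $\Pr\left\{\left|n^{-1}\log\|\Pi_n\| - E\log\|X_1^{(N)}v\|\right| > \delta\right\} < \delta$

for all $N$ and all $n \ge n_0(\delta)$. Let $b(N)$ be an arbitrary function of $N$ such that $\lim_{N\to\infty}b(N) = +\infty$. Then

$\lim_{N\to\infty}\Pr\left\{\|X_1^{(N)}\| > b(N)\right\} = 0.$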



Proof: Let $v_0$ be such a unit vector that

$\|X_1^{(N)}v_0\| = \|X_1^{(N)}\|.$

Note that $X_1^{(N)}v_0/\|X_1^{(N)}v_0\|$ has the Haar distribution by the assumption of rotational invariance. By using the fact that $\|\Pi_n\| \ge \|\Pi_nv_0\|$, we can write the inequality

$n^{-1}\log\|\Pi_n\| \ge \frac{1}{n}\log\|X_1^{(N)}\| + \frac{1}{n}\sum_{i=2}^{n}\log\|X_i^{(N)}u_i\|,$

where $u_i$ are independent Haar-distributed vectors. By using Assumption D', we can conclude that $\frac{1}{n}\sum_{i=2}^{n}\log\|X_i^{(N)}u_i\|$ converges in probability to $E\log\|X_1^{(N)}v\|$ and that the convergence is uniform in $N$. This fact and the supposition of the theorem imply that $\frac{1}{n}\log\|X_1^{(N)}\|$ must converge in probability to zero as $n \to \infty$, and that the convergence must be uniform in $N$. If the conclusion of the theorem were invalid, then for some $\delta > 0$ and all $n$, we could find an $N = N(n, \delta)$ such that $\Pr\left\{\log\|X_1^{(N)}\| > n\delta\right\} > \delta$, and this would contradict the uniform convergence of $n^{-1}\log\|X_1^{(N)}\|$ to zero.
QED.

4. Conclusion. In this paper, we found sufficient conditions that ensure that the convergence rate in the Furstenberg-Kesten theorem is uniform with respect to the dimension of the space in which matrices operate. Let us call this phenomenon dimensional uniformity of convergence.

Several interesting questions remain to be answered. First, is it possible to prove the dimensional uniformity of convergence for random matrices which are not rotationally invariant, for example, for Wigner matrices?

Second, assuming rotational invariance, what characterises the laws of singular values $s_k^{(i,N)}$ for which the dimensional uniformity of convergence holds? In other words, what are necessary and sufficient conditions for dimensional uniformity of convergence?